
    A very low latency pitch tracker for audio to midi conversion

    International audience. An algorithm for estimating the fundamental frequency of a single-pitch audio signal is described, for application to audio-to-MIDI conversion. To minimize latency, the method is based on the ESPRIT algorithm, combined with a statistical model of the partials' frequencies. It is tested on real guitar recordings and compared with the YIN estimator. We show that, in this particular context, both methods exhibit similar accuracy, but the periodicity measure used for note segmentation is much more stable with the ESPRIT-based algorithm, which significantly reduces ghost notes. The method also comes very close to the theoretical minimum latency, i.e. the fundamental period of the lowest observable pitch. Furthermore, fast implementations appear to reach a reasonable complexity and could be compatible with real-time operation, although this is not tested in this study.
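
As a rough illustration of the latency bound mentioned in this abstract, the sketch below computes one fundamental period for a given lowest pitch and maps a frequency to a MIDI note number. The function names, the 82.41 Hz low E of a guitar in standard tuning, and the A4 = 440 Hz reference are assumptions chosen for the example; they are not taken from the paper.

```python
import math


def min_latency_ms(lowest_freq_hz: float) -> float:
    """One fundamental period of the lowest observable pitch, in milliseconds."""
    if lowest_freq_hz <= 0:
        raise ValueError("frequency must be positive")
    return 1000.0 / lowest_freq_hz


def freq_to_midi(freq_hz: float, a4_hz: float = 440.0) -> float:
    """Fractional MIDI note number for a frequency (equal temperament, A4 = note 69)."""
    return 69.0 + 12.0 * math.log2(freq_hz / a4_hz)


# Assumed example value: low E string of a guitar in standard tuning.
LOW_E_HZ = 82.41

print(round(min_latency_ms(LOW_E_HZ), 2))  # 12.13
print(round(freq_to_midi(LOW_E_HZ)))       # 40
```

Under these assumptions, no pitch tracker covering the low E string could respond in much less than about 12 ms, which is the bound the abstract refers to.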

    SPARSE DECOMPOSITION OF AUDIO SIGNALS USING A PERCEPTUAL MEASURE OF DISTORTION. APPLICATION TO LOSSY AUDIO CODING.

    International audience. State-of-the-art audio codecs use time-frequency transforms derived from cosine bases, followed by a quantization stage. The quantization steps are set according to perceptual considerations. In the last decade, several studies applied adaptive sparse time-frequency transforms to audio coding, e.g. on unions of cosine bases using a Matching-Pursuit-derived algorithm. This was shown to significantly improve coding efficiency. We propose another approach based on a variational algorithm, i.e. the optimization of a cost function that takes into account both a perceptual distortion measure derived from a hearing model and a sparsity constraint, which favors coding efficiency. In this early version, we show that, using a coding scheme without perceptual control of quantization, our method outperforms a codec from the literature that uses the same quantization scheme. In future work, a more sophisticated quantization scheme would probably allow our method to challenge standard codecs, e.g. AAC.

    A new model-based algorithm for optimizing the MPEG-AAC in MS-stereo

    International audience. In this paper, a new model-based algorithm for optimizing the MPEG Advanced Audio Coder (AAC) in MS-stereo mode is presented. This algorithm extends prior work on a statistical model of quantization noise to stereo signals. Traditionally, MS-stereo coding approaches replace the Left (L) and Right (R) channels by the Middle (M) and Sides (S) channels, each channel being processed independently, almost like a monophonic signal. In contrast, our method takes a global approach and codes both channels in the same process. A model of the quantization error allows us to tune the quantizers on channels M and S with respect to a distortion constraint on the reconstructed channels L and R as they will appear in the decoder. This approach leads to more efficient perceptual noise shaping and avoids using complex psychoacoustic models built on the M and S channels. Furthermore, it provides a straightforward scheme for choosing between LR and MS modes in each subband for each frame. Subjective listening tests show that the coding efficiency at a medium bitrate (96 kbit/s for both channels) is significantly better with our algorithm than with the standard algorithm, with no increase in complexity.
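
To make the link between quantization on M/S and distortion on L/R concrete, here is a minimal sketch. It assumes the common convention M = (L + R)/2 and S = (L - R)/2 and a plain uniform quantizer; the sample values, the step size and all function names are hypothetical, and this is not the paper's statistical noise model.

```python
def lr_to_ms(left, right):
    """Mid/side transform, assuming M = (L + R) / 2 and S = (L - R) / 2."""
    mid = [0.5 * (l + r) for l, r in zip(left, right)]
    side = [0.5 * (l - r) for l, r in zip(left, right)]
    return mid, side


def ms_to_lr(mid, side):
    """Inverse transform: L = M + S and R = M - S."""
    left = [m + s for m, s in zip(mid, side)]
    right = [m - s for m, s in zip(mid, side)]
    return left, right


def quantize(values, step):
    """Uniform scalar quantizer (illustrative stand-in for the AAC quantizer)."""
    return [step * round(v / step) for v in values]


# Hypothetical sample values for one short block.
left = [0.30, -0.12, 0.57, 0.08]
right = [0.25, -0.20, 0.41, 0.02]

mid, side = lr_to_ms(left, right)
mid_q, side_q = quantize(mid, 0.05), quantize(side, 0.05)
left_hat, right_hat = ms_to_lr(mid_q, side_q)

# Errors made on M and S, and the errors they induce on L and R.
err_m = [q - m for q, m in zip(mid_q, mid)]
err_s = [q - s for q, s in zip(side_q, side)]
err_l = [lh - l for lh, l in zip(left_hat, left)]
err_r = [rh - r for rh, r in zip(right_hat, right)]
```

With this convention the decoder-side errors are err_l = err_m + err_s and err_r = err_m - err_s, which is why a distortion constraint stated on L and R couples the two quantizers on M and S.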

    Calculation of an entropy-constrained quantizer for exponentially damped sinusoids parameters

    Technical report, 5 pages. The Exponentially Damped Sinusoids (EDS) model can efficiently represent real-world audio signals. In the context of low-bitrate parametric audio coding, the EDS model could bring a significant improvement over classical sinusoidal models. The inclusion of an additional damping parameter calls for a specific quantization scheme. In this report, we describe a new joint-scalar quantization scheme for EDS parameters under the high-resolution hypothesis, which is much easier to implement than a vector quantization scheme. A performance evaluation of this quantizer against a 3-dimensional vector quantizer is given in a paper submitted to IEEE Signal Processing Letters, entitled "Entropy-Constrained Quantization of Exponentially Damped Sinusoids Parameters".

    Joint source/channel decoding of scalefactors in MPEG-AAC encoded bitstreams

    International audience. This paper describes a bandwidth-efficient method for improved decoding of MPEG-AAC bitstreams when the encoded data are transmitted over a noisy channel. Assuming that the critical part (headers) of each frame has been correctly received, we apply a soft-decoding method to reconstruct the scalefactors, which represent a highly noise-sensitive part of the bitstream. The damaged spectral data are reconstructed using an intra-frame error concealment method. Two methods for soft decoding of scalefactors are described: blind mode and informed mode. In the latter, a very small amount of additional data is included in the bitstream. At medium SNR, this method provides a significant improvement in perceptual signal quality compared to the classical hard-decoding method.

    A quasi-orthogonal, invertible, and perceptually relevant time-frequency transform for audio coding

    International audience. We describe ERB-MDCT, an invertible real-valued time-frequency transform based on the MDCT, which is widely used in audio coding (e.g. MP3 and AAC). ERB-MDCT was designed along the same lines as ERBLet, a recent invertible transform whose resolution evolves across frequency to match the perceptual ERB frequency scale, whereas the frequency scale of most invertible transforms (e.g. MDCT) is uniform. ERB-MDCT has mostly the same frequency scale as ERBLet, but the main improvement is that its atoms are quasi-orthogonal, i.e. its redundancy is close to 1. Furthermore, the energy is sparser in the time-frequency plane. It is therefore more suitable for audio coding than ERBLet.
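
For readers unfamiliar with the ERB scale, the sketch below evaluates the widely used Glasberg and Moore approximation of the equivalent rectangular bandwidth and the corresponding ERB-number scale. The abstract does not say which ERB formula ERB-MDCT uses, so this choice and the function names are assumptions made for illustration only.

```python
import math


def erb_bandwidth_hz(freq_hz: float) -> float:
    """Equivalent rectangular bandwidth at freq_hz (Glasberg and Moore approximation)."""
    return 24.7 * (4.37 * freq_hz / 1000.0 + 1.0)


def erb_number(freq_hz: float) -> float:
    """Position of freq_hz on the ERB-number scale (same approximation)."""
    return 21.4 * math.log10(1.0 + 0.00437 * freq_hz)


# Bandwidth grows with frequency, unlike the uniform bins of a plain MDCT.
for f in (100.0, 1000.0, 8000.0):
    print(f, round(erb_bandwidth_hz(f), 1), round(erb_number(f), 2))
```

A transform whose frequency resolution follows this scale uses narrow bands at low frequencies and progressively wider bands at high frequencies.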

    A Novel Approach for Ultra Low-Power WSN Node Generation

    International audience. Wireless Sensor Network (WSN) technology is now emerging, with applications in various domains of human life, e.g. medicine, environmental monitoring and military surveillance. WSN systems consist of low-cost, low-power sensor nodes that communicate efficiently over short distances. It has been shown that power consumption is the biggest design constraint for such systems. Currently, WSN nodes are designed using low-power microcontrollers. However, their power dissipation is still orders of magnitude too high, which limits the widespread adoption of WSN technology. In this paper, we propose an alternative approach that uses hardware specialization and power gating to generate distributed hardware micro-tasks. We target control-oriented tasks running on WSN nodes and present, as a case study, a lamp-switching application. Our approach is validated experimentally and shows significant power gains over a software implementation on a low-power microcontroller such as the MSP430.

    Entropy-constrained quantization of exponentially damped sinusoids parameters

    International audience. Sinusoidal modeling is traditionally one of the most popular techniques for low-bitrate audio coding. Usually, the sinusoidal parameters are kept constant within a time segment, but the exponentially damped sinusoidal (EDS) model is also an efficient alternative. However, the inclusion of an additional damping parameter calls for a specific quantization scheme. In this paper, we propose an asymptotically optimal entropy-constrained quantization method for the amplitude, phase and damping parameters. We show that this scheme is nearly optimal in terms of rate-distortion trade-off. We also show that damping consumes the smallest part of the total entropy of the quantization indexes, which suggests that the EDS model is truly efficient for audio coding.
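
The EDS model named in this abstract can be sketched as a sum of sinusoids whose amplitudes decay exponentially. The code below assumes the usual real-valued form x[n] = sum_k a_k exp(-d_k n) cos(2 pi f_k n + phi_k), with frequencies in cycles per sample; the function name and the parameter layout are hypothetical and the quantization scheme itself is not shown.

```python
import math


def eds_synthesize(components, n_samples):
    """Synthesize an exponentially damped sinusoids signal.

    components: iterable of (amplitude, damping, frequency, phase) tuples,
    with damping per sample and frequency in cycles per sample (assumed units).
    """
    components = list(components)
    signal = []
    for n in range(n_samples):
        sample = sum(
            a * math.exp(-d * n) * math.cos(2.0 * math.pi * f * n + p)
            for a, d, f, p in components
        )
        signal.append(sample)
    return signal


# Two hypothetical partials; the second one decays faster than the first.
partials = [(1.0, 0.001, 0.01, 0.0), (0.5, 0.01, 0.02, math.pi / 4)]
x = eds_synthesize(partials, 512)
```

Setting every damping to zero recovers the classical constant-amplitude sinusoidal model, which shows that damping is the only extra parameter per component that has to be quantized.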

    Adjusting the Spectral Envelope Evolution of Transposed Sounds with Gabor Mask Prototypes

    International audience. Audio samplers often need to modify the pitch of recorded sounds in order to generate scales or chords. This article examines the use of Gabor masks and their capacity to improve the perceptual realism of transposed notes obtained through the classical phase-vocoder algorithm. Gabor masks can be seen as operators that allow the time-dependent spectral content of sounds to be modified by acting on their time-frequency representation. The goal here is to restore a distribution of energy that is more in line with the physics of the structure that generated the original sound. The Gabor mask is built from an estimate of the spectral envelope evolution in the time-frequency plane, and then applied to the modified Gabor transform. This operation turns the modified Gabor transform into another one that respects the estimated spectral envelope evolution, and therefore leads to a note that is more perceptually convincing.

    Architectures de contrôleurs ultra-faible consommation pour noeuds de réseau de capteurs sans fil (Ultra-low-power controller architectures for wireless sensor network nodes)

    National audience. This article deals with the design of control architectures for the nodes of a sensor network. By jointly using hardware specialization to reduce dynamic power consumption and power gating during sleep phases, we propose an original architecture paradigm together with its functional design flow, starting from high-level specifications (the C language combined with a purpose-built language). We illustrate the gains brought by a complete flow for generating hardware micro-tasks compared with conventional software implementations targeting microcontrollers. By combining hardware specialization with static power reduction techniques (power gating), we very significantly reduce the overall power (and energy) dissipated by the system. Results on benchmarks from the sensor network domain show energy gains of up to two orders of magnitude compared with the best low-power microcontrollers in the field.